With the rapid increase in low-cost and sophisticated digital technology the need for techniques to authenticate
digital material will become more urgent. In this paper we address the problem of authenticating digital
signals assuming no explicit prior knowledge of the original. The basic approach that we take is to assume 
that in the frequency domain a "natural" signal has weak higher-order statistical correlations. We then show 
that "un-natural" correlations are introduced if this signal is passed through a non-linearity (which would 
almost surely occur in the creation of a forgery). Techniques from polyspectral analysis are then used to 
detect the presence of these correlations. We review the basics of polyspectral analysis, show how and why 
these tools can be used in detecting forgeries and show their effectiveness in analyzing human speech. 
1 Introduction

The Federal Rules of Evidence outline the requirements
for the introduction of material in a United States court
of law [1]. These Rules cover various aspects relating to
the introduction of audio recordings and photographs as
evidence. The courts have stated that to prove the content
of a recording or photograph, the original is required
(Article X, Rule 1002), where Rule 1001 of the same article
defines original as:

An "original" of a writing or recording is the 
writing or recording itself or any counterpart 
intended to have the same effect by a person 
executing or issuing it. An "original" of a 
photograph includes the negative or any print 
therefrom. If data are stored in a computer or
similar device, any printout or other output readable
by sight, shown to reflect the data accurately,
is an "original".

Given the ease with which digital material can be manipulated
it is surprising that the courts have adopted such
a broad definition of original. To contend with the technical
and legal issues involved in the use of digital material
as evidence the courts have turned to the Department
of Justice Computer Crime and Intellectual Property
Section (CCIPS). Section VIII of the CCIPS Federal
Guidelines for Searching and Seizing Computers points
out many of the concerns involved in the use of digital
media as evidence, but in the end is left to simply conclude:

For the time being, however, most computer 
evidence can still be altered electronically - 
in dramatic ways or in imperceptible detail 
- without any sign of erasure. But this does 
not mean that electronic evidence, having become 
less distinctive, has become any less admissible. It 
simply may require us to authenticate it in other
ways. 

With the rapid increase in low-cost and sophisticated
digital technology the need for techniques to authenticate
digital material will become more urgent. In this
paper we address the problem of determining whether a
digital signal has been tampered with from the time of
its recording. As illustrated in Figure 1, our work differs
from recent work in digital watermarking, where the goal
is to embed in a signal an imperceptible and distinctive
pattern that is impervious to a broad class of manipulations;
the watermark is stored and later used for authentication
(see [2] for a review). Such techniques have
primarily been developed for protecting copyrights. Our
work is also distinct from the vast authentication literature
which encodes and encrypts certain aspects of what
is assumed to be the authentic signal; this signature is
later used for authentication (see [3] for a review). These
techniques have been popular in trying to ensure that
signals are not tampered with during transmission over
insecure lines.

Figure 1: Unlike other authentication techniques (right)
we are interested in verifying the authenticity of digital
signals (left) with no explicit prior knowledge of the
original.

Unlike both of these approaches we assume no explicit
prior knowledge of the original signal. The basic
approach that we take to detecting digital forgeries is to
assume that in the frequency domain a "natural" signal
has weak higher-order statistical correlations. We then
show that if this signal is passed through a non-linearity
(which would almost surely occur in the creation of a
forgery), "un-natural" higher-order correlations are
introduced (in magnitude and phase). Techniques from
polyspectral analysis [4] are then used to detect the presence
of these correlations, which serves as an indication of
digital tampering. This paper focuses primarily on analyzing
human speech: we show that in the frequency domain
such signals have weak higher-order correlations,
but that when tampered with such correlations are introduced.
To begin, we introduce the necessary tools from
polyspectral analysis and then show how and why these
tools can be useful in detecting forgeries. Several examples
showing the effectiveness of this approach are given.
Figure 2: A simple non-linearity introduces new harmonics
with higher-order correlations.


2 Bispectral Analysis 

For purposes of illustration consider a signal composed
of a sum of two sinusoids:

$$x(n) = \sin(\omega_1 n + \phi_1) + \sin(\omega_2 n + \phi_2). \tag{1}$$

Consider now passing this signal through a simple non-linearity:

$$y(n) = x^2(n) + x(n). \tag{2}$$

Expanding the polynomial and rewriting in terms of the
harmonics using basic trigonometric identities gives:

$$\begin{aligned}
y(n) = {} & \tfrac{1}{2}\left(1 + \sin(2\omega_1 n + 2\phi_1 - \tfrac{\pi}{2})\right) \\
& + \tfrac{1}{2}\left(1 + \sin(2\omega_2 n + 2\phi_2 - \tfrac{\pi}{2})\right) \\
& + 2\sin((\omega_1 + \omega_2)n + (\phi_1 + \phi_2)) \\
& + 2\sin((\omega_1 - \omega_2)n + (\phi_1 - \phi_2)) \\
& + \sin(\omega_1 n + \phi_1) + \sin(\omega_2 n + \phi_2).
\end{aligned} \tag{3}$$

Notice that this simple non-linearity introduces new harmonics
with frequencies and phases that are correlated
(Figure 2).$^1$ In addition, these correlations simply could
not have been introduced through a linear transform. Our
goal now is to try to detect these "un-natural" higher-order
correlations as a means of detecting the presence
of a non-linearity.

The signal is first decomposed according to the Fourier
transform:

$$Y(\omega) = \sum_{k=-\infty}^{\infty} y(k)\, e^{-i\omega k}, \tag{4}$$

with $\omega \in [-\pi, \pi]$. It is common practice to use the power
spectrum to detect the presence of second-order correlations:

$$P(\omega) = Y(\omega)\, Y^*(\omega), \tag{5}$$

where $Y^*(\omega)$ is the complex conjugate. The power
spectrum, however, is blind to higher-order correlations of
the sort of interest to us. These correlations can be detected
by turning to higher-order spectral analysis. For
example the bispectrum is used to detect the presence of
third-order correlations:

$$B(\omega_1, \omega_2) = Y(\omega_1)\, Y(\omega_2)\, Y^*(\omega_1 + \omega_2). \tag{6}$$

Comparing the bispectrum with Equation (3) and Figure 2
we can see intuitively that the bispectral response
reveals the sorts of "un-natural" higher-order correlations
introduced by a non-linearity. That is, correlations between
the triples $[\omega_1, \omega_1, \omega_1 + \omega_1]$, $[\omega_2, \omega_2, \omega_2 + \omega_2]$,
$[\omega_1, \omega_2, \omega_1 + \omega_2]$, and $[\omega_1, -\omega_2, \omega_1 - \omega_2]$.

From an interpretive stance it will be convenient to express
the complex bispectrum with respect to its magnitude:

$$|B(\omega_1, \omega_2)| = |Y(\omega_1)| \cdot |Y(\omega_2)| \cdot |Y(\omega_1 + \omega_2)|, \tag{7}$$

and phase:

$$\angle B(\omega_1, \omega_2) = \angle Y(\omega_1) + \angle Y(\omega_2) - \angle Y(\omega_1 + \omega_2). \tag{8}$$
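The frequencies predicted by Equation (3) can be checked numerically. The following is a minimal sketch, not part of the original experiments: the signal length `N`, the integer frequency bins `k1` and `k2`, the phases, and the helper `active_bins` are all illustrative assumptions. It lists the DFT bins that carry energy before and after the squaring non-linearity of Equation (2).

```python
import numpy as np

def active_bins(signal, rel_tol=1e-6):
    """Return the set of non-negative DFT bins holding non-negligible energy."""
    spectrum = np.abs(np.fft.rfft(signal))
    return set(np.flatnonzero(spectrum > rel_tol * spectrum.max()).tolist())

# Illustrative parameters (assumptions, not values from the paper).
N = 256                 # number of samples
k1, k2 = 5, 12          # the two harmonics, as integer DFT bins
phi1, phi2 = 0.3, 1.1   # arbitrary phases

n = np.arange(N)
x = np.sin(2 * np.pi * k1 * n / N + phi1) + np.sin(2 * np.pi * k2 * n / N + phi2)
y = x ** 2 + x          # the non-linearity of Equation (2)

bins_x = active_bins(x)
bins_y = active_bins(y)
new_bins = bins_y - bins_x   # harmonics introduced by the non-linearity

print("input harmonics: ", sorted(bins_x))
print("output harmonics:", sorted(bins_y))
print("newly introduced:", sorted(new_bins))
```

Under these assumptions the new bins are the constant term, the doubled frequencies `2*k1` and `2*k2`, and the sum and difference frequencies `k1 + k2` and `k2 - k1`, which are exactly the frequency triples that the bispectrum of Equation (6) ties together.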


Also from an interpretive stance it is helpful to work with
the normalized bispectrum, the bicoherence:

$$B_c(\omega_1, \omega_2) = \frac{Y(\omega_1)\, Y(\omega_2)\, Y^*(\omega_1 + \omega_2)}{\sqrt{|Y(\omega_1)\, Y(\omega_2)|^2\, |Y(\omega_1 + \omega_2)|^2}}. \tag{9}$$


There is much debate in the literature on the "correct"
normalization factor; this particular form [6] is adopted
because it guarantees that the bicoherence values are in
the range [0,1]. In the absence of noise the bicoherence
can be estimated from a single realization as in Equation (9).
However in the presence of noise some form
of averaging is required to ensure stable estimates. A
common and convenient form of averaging is to divide
the signal into multiple segments. For example the signal
$y(n)$ with $n \in [1, N]$ can be divided into $K$ segments
of length $M = N/K$, or $K$ overlapping segments with
$M > N/K$. The bicoherence is then estimated from the
average of each segment's bicoherence spectrum:

$$\hat{B}_c(\omega_1, \omega_2) = \frac{\frac{1}{K}\sum_k Y_k(\omega_1)\, Y_k(\omega_2)\, Y_k^*(\omega_1 + \omega_2)}{\sqrt{\frac{1}{K}\sum_k |Y_k(\omega_1)\, Y_k(\omega_2)|^2 \; \frac{1}{K}\sum_k |Y_k(\omega_1 + \omega_2)|^2}}. \tag{10}$$
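The segment-averaged estimate of Equation (10) can be sketched in a few lines of NumPy. This is an illustrative implementation rather than the code used for the experiments: the function name `bicoherence`, its default arguments, and the small floor added to the denominator to avoid division by zero are assumptions. Frequencies are indexed by DFT bin, and the sum $\omega_1 + \omega_2$ wraps around modulo the DFT length.

```python
import numpy as np

def bicoherence(y, seg_len=32, overlap=16, nfft=128):
    """Segment-averaged bicoherence in the form of Equation (10).

    Returns a complex (nfft, nfft) array indexed by DFT bin; its magnitude
    lies in [0, 1] by the Cauchy-Schwarz inequality.
    """
    y = np.asarray(y, dtype=float)
    hop = seg_len - overlap
    window = np.hanning(seg_len)                    # symmetric Hanning window
    idx = np.arange(nfft)
    sum_idx = (idx[:, None] + idx[None, :]) % nfft  # bin of w1 + w2 (wraps)

    num = np.zeros((nfft, nfft), dtype=complex)     # sum of triple products
    pow_pair = np.zeros((nfft, nfft))               # sum of |Y(w1) Y(w2)|^2
    pow_sum = np.zeros((nfft, nfft))                # sum of |Y(w1 + w2)|^2
    count = 0
    for start in range(0, len(y) - seg_len + 1, hop):
        # Zero-padded DFT of one windowed segment.
        Y = np.fft.fft(window * y[start:start + seg_len], nfft)
        pair = Y[:, None] * Y[None, :]
        third = Y[sum_idx]
        num += pair * np.conj(third)
        pow_pair += np.abs(pair) ** 2
        pow_sum += np.abs(third) ** 2
        count += 1

    den = np.sqrt((pow_pair / count) * (pow_sum / count))
    return (num / count) / np.maximum(den, 1e-12)   # floor is an assumption

# Usage on an arbitrary test signal (white noise, chosen only for illustration).
rng = np.random.default_rng(0)
bc = bicoherence(rng.standard_normal(2048))
print(bc.shape, float(np.abs(bc).max()))
```

Accumulating the three sums segment by segment keeps memory use at a few `nfft` by `nfft` arrays, at the cost of a Python-level loop over the segments.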

Correlations beyond second- and third-order can also
be detected. For example, the trispectrum is sensitive to
fourth-order correlations:

$$T(\omega_1, \omega_2, \omega_3) = Y(\omega_1)\, Y(\omega_2)\, Y(\omega_3)\, Y^*(\omega_1 + \omega_2 + \omega_3). \tag{11}$$

$^1$ An observation that Feynman mentioned would "have many practical
applications" [5].

And in general the $N^{th}$-order polyspectrum is sensitive
to $(N+1)^{st}$-order correlations. These polyspectra can also
be computed for higher-dimensional signals. For example,
a 2-D signal with Fourier transform $Y(\omega_x, \omega_y)$ has a
4-D bispectrum:

$$B(\omega_{x1}, \omega_{y1}, \omega_{x2}, \omega_{y2}) = Y(\omega_{x1}, \omega_{y1})\, Y(\omega_{x2}, \omega_{y2})\, Y^*(\omega_{x1} + \omega_{x2}, \omega_{y1} + \omega_{y2}). \tag{12}$$

For now we will concentrate on the use of the bispectrum/bicoherence
for detecting the presence of "un-natural"
correlations brought about through some form of non-linear
tampering.

3 Detecting Forgeries 

In the previous section we saw that a simple squaring
non-linearity applied to the sum of two sinusoids introduces
new harmonics with distinct higher-order correlations
between the individual harmonics (Equation (3)).
We also saw how the bispectrum/bicoherence can be used
to detect these correlations (Equation (6)). If such higher-order
correlations are weak in "natural" signals then their
presence can be used as an indication of tampering, thus
casting the authenticity of the signal into a suspicious
light. In the next section we provide some empirical evidence
that for human speech, these higher-order correlations
are in fact relatively weak.

But first we need to show why we might expect increased
activity in the bicoherence when an arbitrary non-linearity
is applied to an arbitrary signal. To this end consider
first an arbitrary function $f(\cdot)$:

$$y(n) = f(x(n)). \tag{13}$$

For notational convenience this system is expressed in an
equivalent vector form as $\vec{y} = \vec{f}(\vec{x})$. The vector valued
function $\vec{f}(\cdot)$ can be expressed in terms of scalar valued
functions:

$$\vec{f}(\vec{x}) = \left(\, f_1(\vec{x}) \;\; f_2(\vec{x}) \;\; \ldots \;\; f_N(\vec{x}) \,\right). \tag{14}$$

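Each of these scalar valued functions $f_i(\cdot)$ can subsequently
be expressed in terms of its Taylor series expansion about
a point $\vec{p}$ (taken here to be the origin, so that the
displacement $\vec{x} - \vec{p}$ is simply $\vec{x}$):

$$f_i(\vec{x}) = f_i(\vec{p}) + \sum_j \frac{\partial f_i}{\partial x_j}\bigg|_{\vec{p}} x_j + \frac{1}{2} \sum_{j,k} \frac{\partial^2 f_i}{\partial x_j \partial x_k}\bigg|_{\vec{p}} x_j x_k + \ldots \tag{15}$$

For simplicity this expansion is truncated after the second-order
term and rewritten in vector/matrix form as:

$$f_i(\vec{x}) \approx a + \vec{b}^T \vec{x} + \tfrac{1}{2}\, \vec{x}^T C\, \vec{x}, \tag{16}$$

with the scalar $a = f_i(\vec{p})$, the vector $[\vec{b}]_j = \frac{\partial f_i}{\partial x_j}\big|_{\vec{p}}$, and
the matrix $[C]_{jk} = \frac{\partial^2 f_i}{\partial x_j \partial x_k}\big|_{\vec{p}}$, with $i, j, k \in [1, N]$. The
point of all of this is simply that the application of an
arbitrary non-linearity $\vec{f}(\vec{x})$ results in an output that contains
a sum of a linear term $\vec{b}^T \vec{x}$ and a quadratic term $\vec{x}^T C\, \vec{x}$.
Note the similarity with the original example in the previous
section (Equation (2)).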
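As a worked special case, offered only as an illustration, suppose the non-linearity acts point-wise, so that $f_i$ depends on the single sample $x_i$ through a scalar function $g$ (the symbol $g$ is introduced here and does not appear in the text above). Expanding about the origin, only one first derivative and one second derivative survive, and the second-order expansion of Equation (16) collapses to a scalar polynomial:

```latex
% Point-wise non-linearity: f_i(\vec{x}) = g(x_i), expanded about the origin.
f_i(\vec{x}) \approx g(0) + g'(0)\, x_i + \tfrac{1}{2}\, g''(0)\, x_i^2 .

% For g(u) = u^2 + u we have g(0) = 0, g'(0) = 1, g''(0) = 2, so
f_i(\vec{x}) = x_i + x_i^2 ,
% which is exactly the squaring non-linearity of Equation (2).
```

In this case the truncation is exact because $g$ is itself a quadratic; for a general $g$ the higher-order terms are simply dropped.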

Consider now the input signal $x$ expressed in terms of
its Fourier series:

$$x(n) = \sum_{k=0}^{\infty} a_k \sin(kn + \phi_k). \tag{17}$$

The situation is now very similar to that of the previous
section (Equations (1) and (2)). That is, the application of
an arbitrary non-linearity $f(\cdot)$ to an arbitrary signal $x(n)$
will result in higher-order correlations between various
harmonics, and hence an increased bicoherence activity.

In fact, even more specific predictions can be made by
looking closely at how a non-linearity affects the bicoherence
magnitude and phase. Recall that a non-linearity
introduces new harmonics that are correlated with the
original harmonics (Figure 2). For example, given a pair
of harmonics $\omega_1$ and $\omega_2$, a non-linearity produces a new
harmonic $\omega_1 + \omega_2$ whose amplitude is correlated to $\omega_1$
and $\omega_2$. To the extent that such correlations do not occur
naturally, these correlations will result in a larger response
in the bicoherence magnitude (Equation (7)). In
addition, if the initial harmonics have phases $\phi_1$ and $\phi_2$,
then the phase of the newly introduced harmonic is $\phi_1 + \phi_2$.
It is then easy to see that the bicoherence phase for
this pair of harmonics will be 0 (Equation (8)). Again, to
the extent that these higher-order correlations do not occur
naturally, the correlations will result in a bias of the
bicoherence phase towards 0. A phase bias towards $\pi/2$
may also occur due to correlations of any harmonic $\omega_1$
with itself. That is, the bicoherence phase

$$\angle B(\omega_1, \omega_1) = \phi_1 + \phi_1 - (2\phi_1 - \pi/2) = \pi/2.$$
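The zero-phase prediction for a pair of coupled harmonics can be illustrated numerically. The sketch below is an assumption-laden toy example rather than a reproduction of the experiments: the bins `k1` and `k2`, the phases, and the helper `biphase` are all hypothetical. It compares the bispectral phase of Equation (8) at $(\omega_1, \omega_2)$ for a signal passed through the squaring non-linearity against a signal in which a third sinusoid at $\omega_1 + \omega_2$ has been added with an unrelated phase.

```python
import numpy as np

def biphase(signal, k1, k2):
    """Phase of the single-realization bispectrum at DFT bins (k1, k2)."""
    Y = np.fft.fft(signal)
    return float(np.angle(Y[k1] * Y[k2] * np.conj(Y[k1 + k2])))

# Illustrative parameters (assumptions, not values from the paper).
N = 256
k1, k2 = 5, 12
phi1, phi2, phi3 = 0.3, 1.1, 2.0

n = np.arange(N)
x = np.sin(2 * np.pi * k1 * n / N + phi1) + np.sin(2 * np.pi * k2 * n / N + phi2)

# Harmonic at k1 + k2 created by the non-linearity: its phase is tied to phi1, phi2.
coupled = x ** 2 + x

# Harmonic at k1 + k2 added linearly with an independent phase phi3.
uncoupled = x + np.sin(2 * np.pi * (k1 + k2) * n / N + phi3)

phase_coupled = biphase(coupled, k1, k2)
phase_uncoupled = biphase(uncoupled, k1, k2)
print("coupled biphase:  ", phase_coupled)
print("uncoupled biphase:", phase_uncoupled)
```

For the coupled signal the phase is zero to within floating-point error regardless of the choice of `phi1` and `phi2`, whereas for the uncoupled signal it depends on the arbitrary `phi3`. Averaged over many harmonics with unrelated phases, the latter spreads out over the circle, which is why coupling shows up as a bias in the phase histogram.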

We predict that a non-linearity will manifest itself in
the bicoherence in two ways: for certain harmonics an increase
in the magnitude, and for those harmonics a phase
bias towards 0 and/or $\pi/2$. In the following section we
will show how these principles can be used in the detection
of digital forgeries.

4 Results 

We first show how the bicoherence can detect the presence
of global and local non-linearities of the form described
in Section 2. We then show how these ideas generalize
to a broader class of non-linearities that might be
involved in the creation of digital forgeries.
Figure 3: Detecting non-linearities. Shown from left
to right is a portion of the input signal $y(n) = a x^2(n) + x(n)$,
its normalized power spectrum $P(\omega)$
with $\omega \in [-\pi, \pi]$, its bicoherence $B_c(\omega_1, \omega_2)$ with
$\omega_1, \omega_2 \in [-\pi, \pi]$, and the bicoherence phase histogram
plotted from $[-\pi, \pi]$. As the non-linearity increases
there is no change in the power spectrum, but a significant
increase in the overall magnitude and phase bias
(see also Figure 4).


4.1 Global Non-Linearities 

Throughout these examples the input is a 1-D fractal signal
$x(n)$ with 2048 samples and a power spectral density
of the form $|X(\omega)| = 1/\omega$. In this first set of experiments
a signal is subjected to a global non-linearity of the form:

$$y(n) = a x^2(n) + x(n), \tag{18}$$

with $a \in [0, 1]$. The bicoherence is computed by dividing
the signal into 127 overlapping segments of length
32 with an overlap of 16. A 128-point windowed DFT,
$Y_k(\omega)$, is estimated for each segment, from which the
bicoherence $\hat{B}_c(\omega_1, \omega_2)$ is estimated (Equation (10)).$^2$

$^2$ The signal is windowed with a symmetric Hanning window prior
to estimating the DFT.

Shown in the first column of Figure 3 are several signals
with varying amounts of non-linearity, i.e., increasing
values of $a$ in Equation (18). The signals are all plotted
on the same scale of [0,1]. Shown within each row is
a small portion of the input signal $y(n)$, the normalized
power spectrum, the bicoherence magnitude $|B_c(\omega_1, \omega_2)|$,
and a histogram of the bicoherence phase $\angle B_c(\omega_1, \omega_2)$ for
those frequencies with $|B_c(\omega_1, \omega_2)| > 0.2$. With increasing
amounts of non-linearity there is an increase in the
magnitude and phase bias towards zero. At the same
time the normalized power spectrum remains virtually
unchanged - it is blind to the higher-order correlations
introduced by the non-linearity.

The increase in bicoherence activity can be quantified
by measuring the mean magnitude and the phase bias.
To measure the phase bias away from uniform we compute
the variance across the bin counts from a discrete
histogram of the phases (i.e., deviations from a uniform
distribution will have a large variance). To show the general
robustness of this measure the bicoherence is measured
for one hundred independent fractal signals with
varying levels of non-linearity. Shown in Figure 4 are
the mean magnitude and phase bias plotted as a function
of increasing amounts of non-linearity. Each data
point corresponds to the average response over the one
hundred trials; the error bars correspond to the standard
error. With increasing amounts of non-linearity the mean
magnitude and phase bias increase by 35% and 165%, respectively.

Figure 4: Bicoherence averaged over one hundred independent
signals of the form $a x^2(n) + x(n)$ with $a \in [0.0, 1.0]$.
The error bars correspond to the standard error.
As the degree of non-linearity increases both the
magnitude and phase bias increase (see also Figure 3).
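The two summary measures used in this section, and a test signal with a $1/\omega$ amplitude spectrum, can be sketched as follows. Several details are assumptions because the text does not specify them: the number of histogram bins (`n_bins=16`), the use of raw rather than normalized bin counts, the random-phase construction of the fractal signal, and its rescaling into [0, 1]. The function names are likewise hypothetical.

```python
import numpy as np

def fractal_signal(n, rng):
    """Random-phase signal whose amplitude spectrum falls off as 1/frequency."""
    freqs = np.fft.rfftfreq(n)                  # cycles per sample; freqs[0] == 0
    amp = np.zeros_like(freqs)
    amp[1:] = 1.0 / freqs[1:]                   # leave the DC term at zero
    phase = rng.uniform(-np.pi, np.pi, size=freqs.shape)
    x = np.fft.irfft(amp * np.exp(1j * phase), n)
    x -= x.min()
    return x / x.max()                          # rescaled into [0, 1] (assumption)

def summary_measures(bc, mag_threshold=0.2, n_bins=16):
    """Mean bicoherence magnitude and phase bias of a complex bicoherence array.

    The phase bias is the variance of the bin counts of a histogram of the
    phases, restricted to entries whose magnitude exceeds mag_threshold.
    """
    mag = np.abs(bc)
    phases = np.angle(bc[mag > mag_threshold])
    counts, _ = np.histogram(phases, bins=n_bins, range=(-np.pi, np.pi))
    return float(mag.mean()), float(counts.var())

rng = np.random.default_rng(0)
x = fractal_signal(2048, rng)

# Two synthetic bicoherence arrays, used only to show how the measure behaves:
# one with uniformly scattered phases and one with every phase equal to zero.
scattered = 0.5 * np.exp(1j * rng.uniform(-np.pi, np.pi, size=(100, 100)))
aligned = 0.5 * np.ones((100, 100), dtype=complex)

mag_scattered, bias_scattered = summary_measures(scattered)
mag_aligned, bias_aligned = summary_measures(aligned)
print("scattered phases:", mag_scattered, bias_scattered)
print("aligned phases:  ", mag_aligned, bias_aligned)
```

A histogram that is close to uniform has nearly equal bin counts and therefore a small variance, while phases concentrated in a few bins give a large variance. Because raw counts are used here, the absolute value of the phase bias depends on how many entries pass the magnitude threshold, so values are only comparable between analyses run with identical settings.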


4.2 Local Non-Linearities 

The bicoherence is able to reliably detect the presence of
a simple global non-linearity (Figures 3 and 4). For the
purposes of forgery detection it would also be desirable
to be able to detect the presence of local non-linearities.
To test whether this is possible a small portion of a signal
was exposed to the same squaring non-linearity as
above ($a = 1$ in Equation (18)). The full signal contains
8192 samples, with the central 512 samples subjected to
the non-linearity. Using the same methodology as above
the bicoherence is computed at seven different windows
along the signal, where each window contains 2048 samples
with an overlap of 1024. Shown in Figure 5 is the
bicoherence magnitude and phase. For the central (and
one neighboring) window containing the non-linear segment
there is an increased activity in both the magnitude
and phase bias; for window four the mean magnitude
is 0.10 and the phase bias is 42.7. Relative to the global
squaring in Figure 4 this amounts to a non-linearity on
the order of $a = 0.6$.

Figure 5: Detecting non-linearities. Shown in the first
column is the same input signal, where the middle gray
region was subjected to a squaring non-linearity, Equation (2).
Shown to the right of each signal is the bicoherence
magnitude and phase histogram for seven overlapping
windows. Notice the increased activity in the
central segment that includes the non-linearity.
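The window and segment counts quoted in this section follow directly from the lengths and overlaps. A minimal sketch of the bookkeeping (the helper `window_starts` and the variable names are hypothetical):

```python
def window_starts(n_samples, length, overlap):
    """Start indices of all full windows of the given length and overlap."""
    hop = length - overlap
    return list(range(0, n_samples - length + 1, hop))

# Analysis windows along the 8192-sample signal: length 2048, overlap 1024.
analysis = window_starts(8192, 2048, 1024)

# Segments inside one 2048-sample window: length 32, overlap 16.
segments = window_starts(2048, 32, 16)

# The tampered region is the central 512 samples of the full signal.
tampered = (8192 // 2 - 256, 8192 // 2 + 256)

# Windows (numbered from one) that contain the tampered region in full.
containing = [i + 1 for i, s in enumerate(analysis)
              if s <= tampered[0] and tampered[1] <= s + 2048]

print(len(analysis), len(segments), containing)
```

This gives seven analysis windows and 127 segments per window, and only window four contains the whole of the tampered region; its two neighbors each overlap half of it.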


4.3 Synthetic Forgeries 

We have seen how the bicoherence is able to detect "un-natural"
higher-order correlations caused by a simple squaring
non-linearity (Figures 3 and 5). We would now like to
see how well this generalizes to arbitrary non-linearities.

A common technique in digital forging is to splice together
signals in such a way that the seam is not perceptually
salient. A pair of fractal signals were seamlessly
spliced together using a Laplacian pyramid [7].
Each signal, $x_1(n)$ and $x_2(n)$, containing 8192 samples
is first decomposed into a seven level Laplacian pyramid.
A new pyramid is constructed by combining, at
each pyramid level, the left half of $x_1(n)$ with the right
half of $x_2(n)$. This new pyramid is then collapsed, yielding
the "forgery". The rationale for this scheme is that
the splicing looks seamless because the low frequencies
are blended over a large spatial extent, while the details
(high frequencies) are preserved by blending over a smaller
extent. Using the same methodology as described above
the bicoherence is computed at several windows along
the forged signal. Shown in Figure 6 is the bicoherence
magnitude and phase for each window. Notice the increased
activity in the central region where the splice occurred;
the mean magnitude is 0.11 and the phase bias
is 54.7. Relative to the global squaring non-linearity this
amounts to a non-linearity on the order of $a = 0.8$ (Figure 4).

The robustness of this detection was tested over one
hundred independent signals spliced together in the manner
described above. For each signal the bicoherence is
measured across the splice point (window four in Figure 6).
Shown in Figure 7 is the mean magnitude and
phase bias averaged over all the trials. Also shown are
the average values for signals that were not subjected to
any tampering. On average there is a 26% increase in the
mean magnitude and 72% increase in phase bias.
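The pyramid splice described in this section can be sketched for 1-D signals as follows. This is an illustrative reconstruction under stated assumptions, not the code used to produce the figures: the 5-tap binomial kernel, the zero-padded boundary handling of `np.convolve`, the reading of "seven level" as six band-pass levels plus one low-pass residual, and all function names are choices made here.

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0   # assumed smoothing kernel

def reduce_level(g):
    """Blur and subsample by two."""
    return np.convolve(g, KERNEL, mode="same")[::2]

def expand_level(g, n):
    """Upsample by two (zero insertion) and interpolate to length n."""
    up = np.zeros(n)
    up[::2] = g
    return np.convolve(up, 2.0 * KERNEL, mode="same")

def build_pyramid(x, levels=7):
    """Laplacian pyramid: levels - 1 band-pass signals plus a low-pass residual."""
    pyramid, g = [], np.asarray(x, dtype=float)
    for _ in range(levels - 1):
        g_next = reduce_level(g)
        pyramid.append(g - expand_level(g_next, len(g)))
        g = g_next
    pyramid.append(g)
    return pyramid

def collapse(pyramid):
    """Invert build_pyramid by adding the levels back from coarse to fine."""
    g = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        g = band + expand_level(g, len(band))
    return g

def splice(x1, x2, levels=7):
    """Join the left half of x1 to the right half of x2 at every pyramid level."""
    p1, p2 = build_pyramid(x1, levels), build_pyramid(x2, levels)
    merged = [np.concatenate([a[:len(a) // 2], b[len(b) // 2:]])
              for a, b in zip(p1, p2)]
    return collapse(merged)

# Usage on two arbitrary signals (random walks, chosen only for illustration).
rng = np.random.default_rng(0)
x1 = np.cumsum(rng.standard_normal(8192))
x2 = np.cumsum(rng.standard_normal(8192))
forged = splice(x1, x2)
print(len(forged))
```

Because each level is cut at its own midpoint, the coarse levels blend the two signals over a wide neighborhood of the seam while the fine levels switch over abruptly; far from the seam the result coincides with `x1` on the left and `x2` on the right.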

4.4 Audio Forgeries 

We analyzed audio recordings from the four volume collection
of Great Speeches of the 20th Century (Rhino Records
Inc, 1991). The first ten seconds of twenty speeches were
digitized. As described in previous sections the bicoherence
is estimated for each signal at approximately 200
non-overlapping windows of length 2048. Shown in Figure 8
is the list of speakers and the average magnitude
and phase bias. With the exception of one speaker (J.
Jackson), the average magnitude and phase bias are 0.109
and 38.2 with a standard error of 0.002 and 3.1, respectively.
Again with the exception of one speaker, the average
number of windows that contained a magnitude
and phase bias above "normal" is just under 6%. Here
a magnitude above 0.15 together with a phase bias above 50
is considered to be outside of the normal range (Figure 9). With
respect to the simulations in the previous sections (Figure 4)
these natural signals register as containing mid-range
non-linearities. This is not surprising given the inherent
non-linearities in the recording process and the
non-stationarity of the signals. What is encouraging is
that even over such a broad range of signals the responses
are quite similar, thus providing a baseline for future comparisons.

Figure 6: Shown in the first column is the same input
signal consisting of a pair of signals spliced together.
The dotted line represents the true continuation of the
left-half of the signal. Shown to the right of each signal
is the bicoherence magnitude and phase histogram
for seven overlapping windows. Notice the increased
activity in the central segment where the splicing occurred
(see also Figure 7).

signal     magnitude             phase bias
           mean     std. err.    mean    std. err.
natural    0.087    0.0001       16.1    1.0
forgery    0.11     0.001        27.8    1.5

Figure 7: Bicoherence averaged over one hundred
"natural" and forged signals. Notice that the forged signals
have an increased magnitude and phase bias (see
also Figure 6).

Given the numerous ways in which a forgery can be
made there is of course no way of systematically testing
the overall efficacy of the bicoherence. Nevertheless, we
provide one example starting with the following portion
of a rambling speech by Dan Quayle:



Our government unlike many governments 
and particularly the governments of where 
the people that founded this country came 
from is a government that is derived from the 
people that consent to govern the freedom that 
is based in the people that then elect their representatives
to represent them in a free representative
democracy that we have today.

Dan Quayle 

Several splices were made to this speech and rejoined
with local smoothing to yield the more coherent message:

Our government is a government derived from 
the people that then elect their representatives 
in a democracy that we have today. 

Forgery 

The average bicoherence magnitude and phase bias jumped
from within normal range, 0.169/41.0, to well outside of
the normal range, 0.234/69.2 (Figure 9). Even though several
people listening to both audio clips were unable to
identify the forgery there is a significant increase in the
bicoherence activity that would have tagged this recording
as suspicious.

In addition to analyzing forgeries made by tampering
with natural speech we were curious if the very un-natural
sounding computer generated speech would also
be tagged as such. We analyzed the short answering machine
message "Hello, you have no messages" spoken in
a computer generated voice. The bicoherence magnitude
and phase bias are 0.225/76.0, which fall well outside of the
normal range ('X' in Figure 9). Although not a forgery
in the traditional sense, the computer generated voice
clearly does not sound like natural human speech.

Finally, the speech of the Reverend Jesse Jackson is
incorrectly tagged as suspicious (the filled circle in top
right quadrant of Figure 9). We hypothesize that this is
due to significant amounts of feedback in the recording
that was not present in any of the other recordings. This
false positive illustrates an important limitation of looking
to the bicoherence as a means of identifying digital
forgeries: we are unable to distinguish between
innocuous non-linearities and ones that are meant
to deceive the listener. In addition, a sufficiently sophisticated
forger could mask any malicious tampering by
simply adding a harmless global non-linearity to the signal.
While it may not yet be possible to differentiate
between various types of non-linearities, their presence
should at a minimum cast the authenticity of the signal
into a suspicious light.

speaker                 average mag   average phase   %
Hank Aaron              0.121         41.5            9.8
Spiro Agnew             0.112         34.9            6.5
Winston Churchill       0.104         31.1            4.1
Mario Cuomo             0.114         39.8            10.9
Mayor Richard Daley     0.118         41.6            9.5
Amelia Earhart          0.101         22.9            0.5
Dwight D. Eisenhower    0.109         68.0            8.5
Barry Goldwater         0.113         51.8            9.5
Jesse Jackson           0.182         88.0            38.5
Lyndon B. Johnson       0.104         25.2            2.0
Edward Kennedy          0.111         41.2            6.1
John F. Kennedy         0.105         32.3            4.1
Martin Luther King      0.097         23.0            0.5
Charles Lindbergh       0.118         46.8            9.0
Douglas MacArthur       0.118         46.5            6.6
Joseph McCarthy         0.105         53.1            8.9
Richard M. Nixon        0.096         26.3            0.5
Gloria Steinem          0.118         48.7            9.2
Harry S. Truman         0.096         12.8            1.0
Malcolm X               0.107         38.5            6.2

Figure 8: Speaker data set. Shown in the second and
third columns are the average bicoherence magnitude
and phase bias. Shown in the last column is the percentage
of windows where both the magnitude and phase
bias were above "normal" (see also Figure 9).
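The decision rule used in this section, in which a recording is tagged as suspicious when its average magnitude exceeds 0.15 and its phase bias exceeds 50, can be written down directly. The sketch below applies it to the averages reported in the text and in Figure 8; the function name and the dictionary layout are hypothetical.

```python
MAG_THRESHOLD = 0.15         # magnitude above this is outside the normal range
PHASE_BIAS_THRESHOLD = 50.0  # phase bias above this is outside the normal range

def is_suspicious(magnitude, phase_bias):
    """Tag a recording only when both measures are above the normal range."""
    return magnitude > MAG_THRESHOLD and phase_bias > PHASE_BIAS_THRESHOLD

# (average magnitude, average phase bias) as reported in the text and Figure 8.
recordings = {
    "Quayle (original)": (0.169, 41.0),
    "Quayle (spliced)": (0.234, 69.2),
    "computer generated voice": (0.225, 76.0),
    "Jesse Jackson": (0.182, 88.0),
    "Dwight D. Eisenhower": (0.109, 68.0),
}

flags = {name: is_suspicious(*values) for name, values in recordings.items()}
for name, flagged in flags.items():
    print(f"{name}: {'suspicious' if flagged else 'normal'}")
```

Requiring both measures to be high is what keeps the original Quayle recording (high magnitude, normal phase bias) and the Eisenhower recording (normal magnitude, high phase bias) out of the suspicious quadrant, while the Jackson recording is flagged even though it is not a forgery.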


5 Discussion 

This paper addresses the problem of determining whether
a digital signal has been tampered with from the time of
its recording. Unlike techniques in digital watermarking
and authentication, explicit knowledge of the original
signal is not assumed. It is however assumed that
in the frequency domain "natural" signals have weak
higher-order correlations. To the extent that this assumption
holds, we showed that "un-natural" higher-order
correlations are introduced when a signal is passed through
a non-linearity (which would almost surely occur in the
creation of a forgery). Tools from polyspectral analysis
(bispectrum/bicoherence) are employed to detect these
correlations, which are used as an indicator of digital tampering.
More specifically we showed that a non-linearity
manifests itself with an increase in the bicoherence magnitude
and a bias in the bicoherence phase towards 0 or
$\pi/2$. We demonstrated the applicability of this technique
by first showing that for human speech signals higher-order
correlations are in fact weak, and that tampering
with such signals is revealed in the bicoherence.

Figure 9: Shown is the average bicoherence magnitude
and phase bias for the 20 speakers of Figure 8 (filled
circles). The open circles correspond to a speech by Dan
Quayle before (top left quadrant) and after tampering
(top right quadrant). The 'X' corresponds to a computer
generated voice. Points in the top right quadrant are
tagged as suspicious.

Undoubtedly there have been significant efforts by the
law enforcement community in developing techniques
for digital forgery detection. However, to maximize their
effectiveness, much of this work remains unpublished.
Although there are disadvantages to revealing this new
technique, an advantage of analyzing higher-order statistical
correlations is that their presence is most likely
imperceptible to the visual system, and it is not immediately
obvious how they can be removed by manipulating
the signal in the spatial domain. So while a forger may be
aware of this detection scheme, they may not be able to
remove the correlations introduced by their tampering.
Another advantage of the bicoherence is that it is not affected
by additive white noise that is often added to an
image to disguise evidence of tampering.

There are of course limitations to what the bicoherence
is capable of detecting. For example, highly localized
and minor tampering will most likely go unnoticed;
then again, such manipulations are unlikely to dramatically
alter the meaning of a signal. Additionally, rather
innocuous factors can lead to an increased activity in the
bicoherence, for example, significant non-linearities in
the recording process. As a result, the detection of a non-linearity
does not immediately imply the discovery of a
forgery. However it should give reason to question the
authenticity of the signal, suggesting that further inspection
is warranted.

Finally, polyspectral analysis is not limited to only third-order
statistics of one-dimensional signals. We are extending
our analysis to higher-order statistics and applying
these techniques to the analysis of digital images and
video sequences.


[7] E.H. Adelson and P.J. Burt. Image data compression
with the Laplacian pyramid. In Proceedings of the
Conference on Pattern Recognition and Image Processing,
pages 218-223, Dallas, TX, 1981.